Abstract

We prove that any oblivious algorithm using space $S$ to find the median of a list of $n$ integers
from $\{1,\ldots,2n\}$ requires time $\Omega(n \log\log_S n)$. This bound also applies to the problem of determining
whether the median is odd or even. It is nearly optimal since Chan, following Munro and Raman, has
shown that there is a (randomized) selection algorithm using only $s$ registers, each of which can store an
input value or $O(\log n)$-bit counter, that makes only $O(\log\log_s n)$ passes over the input. The bound also
implies a size lower bound for read-once branching programs computing the low order bit of the median
and implies the analog of $\mathrm{P} \ne \mathrm{NP} \cap \mathrm{coNP}$ for length $o(n \log\log n)$ oblivious branching programs.


1 Introduction 

The problem of selection or, more specifically, finding the median of a list of values is one of the most
basic computational problems. Indeed, the classic deterministic linear-time median-finding algorithm of
[9], as well as the more practical expected linear-time randomized algorithm QuickSelect, are among the
most widely taught algorithms.

Though these algorithms are asymptotically optimal with respect to time, they require substantial
manipulation and re-ordering of the input during their execution. Hence, they require the ability to write into
a linear number of memory cells. (These algorithms can be implemented with only $O(1)$ memory locations
in addition to the input if they are allowed to overwrite the input memory.) In many situations, however, the
input is stored separately and cannot be overwritten unless it is brought into working memory. The number
of bits $S$ of working memory that an algorithm with read-only input uses is its space. This naturally leads
to the question of the tradeoffs between the time $T$ and space $S$ required to find the median, or for selection
more generally.

Munro and Paterson [18] gave multipass algorithms that yield deterministic time-space tradeoff upper
bounds for selection for small space algorithms and showed that the number of passes $p$ must be
$\Omega(\log_s n)$ where $S = s \log_2 n$. Building on this work, Frederickson [14] extended the range of space
bounds to nearly linear space, deriving a multipass algorithm achieving a time-space tradeoff of the form
$T = O(n \log^* n + n \log_s n)$. In the case of randomly ordered inputs, Munro and Raman [19] showed that
on average an even better upper bound of $p = O(\log\log_s n)$ passes and hence $T = O(n \log\log_s n)$ is
possible. Chakrabarti, Jayram, and Pătrașcu [12] showed that this is asymptotically optimal for multipass
computations on randomly ordered input streams. Their analysis also applied to algorithms that perform
arbitrary operations during their execution.

Chan [13] showed how to extend the ideas of Munro and Raman [19] to yield a randomized
median-finding algorithm achieving the same time-space tradeoff upper bound as in the average case that they
analyze. The resulting algorithm, like all of those discussed so far, only accesses its input using comparisons.
Chan coupled this algorithm with a corresponding time-space tradeoff lower bound of $T = \Omega(n \log\log_S n)$
for randomized comparison branching programs, which implies the same lower bound for the randomized
comparison RAM model. This is the first lower bound for selection allowing more than multipass access
to the input; the input access can be input-dependent but the algorithm must base all its decisions on the
input order. Though a small gap remains because $S \ne s$, the main question left open by [13] is that of
finding time-space tradeoff lower bounds for median-finding algorithms that are not restricted to the use of
comparisons.

Comparison-based versus general algorithms Though comparison-based algorithms for selection may be
natural, when the input consists of an array of $O(\log n)$-bit integers, as one often assumes, there are natural
alternatives to comparisons such as hashing that might potentially yield more efficient algorithms. Though
comparison-based algorithms match the known time-space tradeoff lower bounds in efficiency for sorting
when time $T$ is $\Omega(n \log n)$ [10, 4], they are powerless in the regime when $T$ is $o(n \log n)$. Moreover, if
one considers the closely related problem of element distinctness, determining whether or not the input has
duplicates, the known time-space tradeoff lower bound of $T = \Omega(n^{2-o(1)}/S)$ for (randomized) comparison
branching programs [22] can be beaten for $S$ up to $n^{1-o(1)}$ by an algorithm using hashing [5] that achieves
$T = \tilde{O}(n^{3/2}/S^{1/2})$. Therefore, the restriction to comparison-based algorithms can be a significant
limitation on efficiency.

Our results We prove a tight $T = \Omega(n \log\log_S n)$ lower bound for median-finding using arbitrary oblivious
algorithms. Oblivious algorithms are those that can access the data in any order, not just in a fixed number
of sweeps across the input, but that order cannot be data dependent. Our lower bound applies even for the
decision problem of computing MedianBit, the low order bit of the median, when the input consists of $n$
integers chosen from $\{1,\ldots,2n\}$. This bound substantially generalizes the lower bound of [12] for
multipass median-finding algorithms. Though our lower bound does not apply when there is input-dependent
access to the input, it allows one to hash the input data values into working storage, and to organize and
manipulate working storage in arbitrary ways.

The median can be computed by a simple nondeterministic oblivious read-once branching program of
polynomial size that guesses and verifies which input integer is the median. When expressed in terms of
size for time-bounded oblivious branching programs our lower bound therefore shows that for every time
bound $T$ that is $o(n \log\log n)$, MedianBit and its complement have nondeterministic oblivious
branching programs of polynomial size but MedianBit requires super-polynomial size deterministic oblivious
branching programs, hence separating the analogs of $\mathrm{P}$ from $\mathrm{NP} \cap \mathrm{coNP}$.

We derive our lower bound using a reduction from a new communication complexity lower bound for
two players to find the low order bit of the median of their joint set of input integers in a bounded number of
rounds. The use of communication complexity lower bounds in the "best partition" model to derive lower
bounds for oblivious algorithms is not new, but the necessity of bounded rounds is. We derive our bound via a
round-preserving reduction from oblivious computation to best-partition communication complexity [20, 2].
This reduction is asymptotically less efficient than the reductions of [3, 11] but the latter do not preserve
the number of rounds, which is essential here since there is a very efficient $O(\log n)$-bit communication
protocol using an unbounded number of rounds ifTTl . Moreover, the loss in efficiency does not prevent us
from achieving asymptotically optimal lower bounds.

*We use $\tilde{O}$ and $\tilde{\Omega}$ notations to hide logarithmic factors.

We further show that the fact that the median function is symmetric in its inputs implies that our oblivious
branching program lower bound also applies to the case of non-oblivious read-once branching programs.
Ideally, we would like to extend our non-oblivious results to larger time bounds. However, we show that
extending our lower bound even to read-twice branching programs in the non-oblivious case would require
fundamentally new lower bound techniques. The hardness of the median problem is essentially that of
a decision problem: Though the median problem has $O(\log n)$ bits of output, the high order bits of the
median are very easy to compute; it is really the low order bit, MedianBit, that is the hardest to produce and
encapsulates all of the difficulty of the problem. Moreover, all current methods for time-space tradeoff lower
bounds for decision problems on general branching programs, and indeed for read-$k$ branching programs
for $k > 1$, also apply to nondeterministic algorithms computing either the function or its complement and
hence cannot apply to the median because it is easy for such algorithms.

2 Preliminaries 

Let $D$ and $R$ be finite sets. We first define branching programs that compute functions $f : D^n \to R$. A
$D$-way branching program is a connected directed acyclic multigraph with special nodes: the source node
and possibly many sink nodes, a sequence of $n$ input values and one output. Each non-sink node is labeled
with an input index and every edge is labeled with a symbol from $D$, which corresponds to the value of the
input indexed at the originating node; there is precisely one out-edge from each non-sink node labeled by
each element of $D$. We assume that each sink node is labeled by an element of $R$. The time $T$ required by a
branching program is the length of the longest path from the source to a sink and the space $S$ is $\log_2$ of the
number of nodes in the branching program. A branching program is leveled iff all the paths from the source
to any given node in the program are of the same length; a branching program can be leveled by adding at
most $\log_2 T$ to its space.

A branching program $B$ computes a function $f_B : D^n \to R$ by starting at the source and then proceeding
along the nodes of the graph by querying the input locations associated with each node and following the
corresponding edges until it reaches a sink node; the label of the sink node is the output of the function.

A branching program is oblivious iff on every path from the source node to a sink node, the sequence of
input indices is precisely the same. It is (syntactic) read-$k$ iff no input index appears more than $k$ times on
any path from the source to a sink.
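The definitions above can be illustrated with a minimal sketch. The dictionary encoding of nodes, the function names, and the toy XOR program are all hypothetical choices made for illustration, not notation from the paper. Note that `is_oblivious` only inspects the paths actually followed by the supplied inputs, which is weaker than the syntactic condition on every source-to-sink path.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

# Hypothetical encoding: a node is ("query", input_index, {symbol: next_node})
# or ("sink", output_label).
Program = Dict[str, tuple]


def evaluate(program: Program, source: str, x: Sequence[Hashable]) -> Tuple[Hashable, List[int]]:
    """Follow the path selected by input x; return (output, list of queried indices)."""
    node, queried = source, []
    while program[node][0] == "query":
        _, index, edges = program[node]
        queried.append(index)
        node = edges[x[index]]  # exactly one out-edge per symbol of D
    return program[node][1], queried


def is_oblivious(program: Program, source: str, inputs) -> bool:
    """True if all given inputs induce the same sequence of queried indices."""
    sequences = {tuple(evaluate(program, source, x)[1]) for x in inputs}
    return len(sequences) == 1


# Toy 2-way program computing the XOR of two bits (D = R = {0, 1}).
xor_program: Program = {
    "s": ("query", 0, {0: "a", 1: "b"}),
    "a": ("query", 1, {0: "out0", 1: "out1"}),
    "b": ("query", 1, {0: "out1", 1: "out0"}),
    "out0": ("sink", 0),
    "out1": ("sink", 1),
}
```

In this toy program the time is $T = 2$ (every path queries two inputs), the space is $\log_2 5$, and the program is both oblivious and read-once.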

Branching programs can easily simulate any sequential model of computation using the same time and
space bounds. In particular branching programs using time $T$ and space $S$ can simulate random-access
machine (RAM) algorithms using time $T$ measured in the number of input locations queried and space
$S$ measured in the number of bits of read/write storage required. The same applies to the simulation of
randomized RAM algorithms by randomized branching programs.

We also find it useful to discuss nondeterministic branching programs for (non-Boolean) functions,
which simulate nondeterministic RAM algorithms for function computation. These have the property that
multiple outedges from a single node can have the same label and outedges for some labels may not be
present. Every input must have at least one path that leads to a sink and all paths followed by an input
vector that lead to a sink must lead to the same one, whose label is the output value of the program. This is
different from the usual version for decision problems in which one only considers accepting paths and infers
the output value for those that are not accepting. When we consider Boolean functions we will typically
assume the usual version based on accepting paths only.

We consider bounded-round versions of deterministic and randomized two-party communication
complexity in which two players Alice and Bob receive $x \in \mathcal{X}$ and $y \in \mathcal{Y}$ and cooperate to compute a function
$f : \mathcal{X} \times \mathcal{Y} \to \mathcal{Z}$. A round in a protocol is a maximal segment of communication in which the player who
speaks does not change. For a distribution $\mathcal{D}$ on $\mathcal{X} \times \mathcal{Y}$, we say that a 2-party deterministic communication
protocol computes $f$ with error at most $\epsilon < 1/2$ under $\mathcal{D}$ iff the probability over $\mathcal{D}$ that the output of the
protocol on input $(x, y) \sim \mathcal{D}$ is equal to $f(x, y)$ is at least $1 - \epsilon$. As usual, via Yao's lemma, for any such
distribution $\mathcal{D}$, the minimum number of bits communicated by any deterministic protocol that computes $f$
with error at most $\epsilon$ is a lower bound on the number of bits communicated by any (public coin) randomized
protocol that computes $f$ with error at most $\epsilon$.

We say that a 2-party deterministic communication protocol has parameters $[P, \epsilon; m_1, m_2, \ldots]$ for $f$
over a distribution $\mathcal{D}$ if:

• the first player to speak is $P \in \{A, B\}$;

• it has error $\epsilon < 1/2$ under input distribution $\mathcal{D}$;

• the players alternate turns, sending messages of $m_1, m_2, \ldots$ bits, respectively.
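As a small illustration of how rounds and the message lengths $m_1, m_2, \ldots$ relate, the following sketch groups a transcript into maximal same-speaker segments. The transcript format (a list of `(speaker, bits)` pairs) and the function name are assumptions made only for this example.

```python
from itertools import groupby


def protocol_shape(transcript):
    """Group a transcript into rounds.

    transcript: list of (speaker, bits) pairs in the order sent, where bits is a
    string of '0'/'1' characters. A round is a maximal run with the same speaker.
    Returns (first_speaker, [m_1, m_2, ...]) with m_i the bit length of round i.
    """
    rounds = [
        (speaker, sum(len(bits) for _, bits in chunk))
        for speaker, chunk in groupby(transcript, key=lambda msg: msg[0])
    ]
    first = rounds[0][0] if rounds else None
    return first, [length for _, length in rounds]
```

For example, two consecutive sends by Alice count as one round, so `protocol_shape([("A", "01"), ("A", "1"), ("B", "0")])` reports Alice speaking first with round lengths 3 and 1.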

For probability distributions $P$ and $Q$ on a domain $U$, the statistical distance between $P$ and $Q$ is
$\|P - Q\| = \max_{A \subseteq U} |P(A) - Q(A)|$, which is 1/2 of the $L_1$ distance between $P$ and $Q$. Let $\log$ denote
$\log_2$ unless otherwise specified. Let $H(X)$ be the binary entropy of random variable $X$,
$H(X|Y) = \mathbb{E}_{y \leftarrow Y} H(X | Y = y)$, and let $I(X;Y|Z)$ be the mutual information between random variables $X$ and $Y$
conditioned on random variable $Z$. We have $I(X;Y|Z) \le H(X|Z) \le H(X)$.
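The two descriptions of statistical distance (maximum gap over events, and half the $L_1$ distance) can be checked on small examples. This is an illustrative sketch; the dictionary representation of distributions and both function names are assumptions, and the brute-force search over events is only feasible for tiny domains.

```python
from itertools import chain, combinations


def statistical_distance(p, q):
    """Half of the L1 distance between two distributions given as {outcome: probability}."""
    universe = set(p) | set(q)
    return 0.5 * sum(abs(p.get(u, 0.0) - q.get(u, 0.0)) for u in universe)


def max_event_gap(p, q):
    """max over events A of |P(A) - Q(A)|, by brute force over all subsets of the domain."""
    universe = sorted(set(p) | set(q))
    events = chain.from_iterable(
        combinations(universe, r) for r in range(len(universe) + 1)
    )
    return max(
        abs(sum(p.get(u, 0.0) - q.get(u, 0.0) for u in event)) for event in events
    )
```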

3 Round Elimination 

Let $f : \mathcal{X} \times \mathcal{Y} \to \{0,1\}$ and consider a distribution $\mathcal{D}$ on $\mathcal{X} \times \mathcal{Y}$. We define the 2-player communication
problem $f^{(k)}$ as follows: Alice receives $x \in \mathcal{X}^k$, while Bob receives $y \in \mathcal{Y}^k$ and $j \in [k]$; together they want
to find $f(x_j, y_j)$. Also, given $\mathcal{D}$ we define an input distribution $\mathcal{D}^{(k)}$ for $f^{(k)}$ by choosing each $(x_i, y_i)$ pair
independently from $\mathcal{D}$, and independently choosing $j$ uniformly from $[k]$.

The following lemma is a variant of standard techniques and was suggested to us by Anup Rao; its proof
is in the appendix for completeness.

Lemma 1. Assume that there exists a 2-party deterministic protocol for $f^{(k)}$ with parameters
$[A, \epsilon; m_1, m_2, m_3, \ldots]$ over $\mathcal{D}^{(k)}$ where $m_1 = \delta^2 k/(8 \ln 2)$. Then there exists a 2-party deterministic
protocol for $f$ with parameters $[B, \epsilon + \delta; m_2, m_3, \ldots]$ over $\mathcal{D}$.

The intuition for this lemma is that, since $f^{(k)}$ has $k$ independent copies of the function $f$ and Alice's
first message has length at most $m_1$ which is only a small fraction of $k$, there must be some copy of $f$ on
which $B$ learns very little information. This is so much less than one bit that $B$ could forego this information
in computing $f$ and still only lose $\delta$ in his probability of correctness. The quadratic difference between the
number of bits of information per copy, $\delta^2/(8 \ln 2)$, and the probability difference, $\delta$, comes from Pinsker's
inequality which relates information and statistical distance.
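The square-root relationship can be made concrete with a small numeric sketch. It uses the standard form of Pinsker's inequality (statistical distance at most $\sqrt{\mathrm{KL}/2}$ with KL measured in nats) and the fact that a message of $m_1$ bits carries on average at most $m_1/k$ bits about each of $k$ independent copies. The function names and sample values are assumptions for illustration; the constants in the appendix proof need not match this back-of-the-envelope calculation.

```python
import math


def first_message_budget(delta: float, k: int) -> float:
    """Message length m_1 = delta^2 * k / (8 ln 2) from the statement of the lemma."""
    return delta ** 2 * k / (8 * math.log(2))


def pinsker_distance_bound(information_bits: float) -> float:
    """Pinsker: statistical distance <= sqrt(KL / 2) with KL in nats (bits * ln 2)."""
    return math.sqrt(information_bits * math.log(2) / 2)


delta, k = 0.1, 10_000          # illustrative values only
m1 = first_message_budget(delta, k)
per_copy_bits = m1 / k          # average information about a single copy
bound = pinsker_distance_bound(per_copy_bits)
```

With these values `bound` works out to `delta / 4`: information that is quadratically small in $\delta$ translates into a statistical distance that is linear in $\delta$.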

4 The Bounded-Round Communication Complexity of 
(the Least-Significant Bit of) the Median 

We consider the complexity of the following communication game. Given a set $A$ of $n$ elements from $[2n]$
partitioned equally between Alice and Bob, determine the least significant bit of the median of $A$. (Since $n$
must be even in order for $A$ to be partitioned evenly, we take the median to be $n/2$-th largest element of $A$.)
We consider the number of rounds of communication required when the length of each message is at most
$m$ for any $m \ge \log n$.

Figure 1: Recursive construction of the pairing for the hard instances.

A Hard Distribution on Median Instances 

For our hard instances we first define a pairing of the elements of $[2n]$ that depends on the value of $m$. The
set $A$ will include precisely one element from each pair. For the input to the communication problem, we
randomly partition the pairs equally between the two players which will therefore also automatically equally
partition the set $A$. We then show how to randomly choose one element from each pair to include in $A$.

In the construction, we define the pairing of $[2n]$ recursively; the parameters of each recursive pairing
will depend on the initial value $n_0$ of $n$. Let $k = k(m, n_0) = m \log^2 n_0$. If $\sqrt{n} < k \log^3 n_0$ then the
elements of $[1, 2n]$ are simply paired consecutively. If $\sqrt{n} \ge k \log^3 n_0$ then the pairing of $[2n]$ consists of a
"core" of $\gamma = \sqrt{n}/\log^2 n_0$ pairs, plus $n - \gamma$ "shell" pairs on $[1, n - \gamma] \cup [n + 1 + \gamma, 2n]$. In the shell, $i$ and
$2n + 1 - i$ are paired. The core pairs are obtained by embedding $k$ recursive instances (using the same values
of $m$ and $n_0$) of $n' = \gamma/k$ pairs each on consecutive sets of $2\gamma/k$ elements, and placing them back-to-back in the
value range $[n - \gamma + 1, n + \gamma]$, see Figure 1. The size of the problem at each level of recursion decreases
from $n$ to $n' = \gamma/k = \sqrt{n}/(m \log^4 n_0)$. In determining the median, the only relevant information about the
shell elements is how many are below $n$; let this number be $\frac{n}{2} - x$. If $x \in [1, \gamma]$, the median of the entire
array $A$ will be the $x$-th order statistic of the core.

If furthermore, $x = \frac{\gamma}{k}(j - \frac{1}{2})$ for an integer $j$, the median of $A$ will be exactly the median of the $j$-th
embedded subproblem. In our distribution of hard instances, we will ensure that $x$ has this nice form.
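The role of the shell can be checked numerically with a small sketch. It makes two simplifying assumptions that are not from the paper: order statistics are counted from the smallest element (so the median is the $(n/2)$-th smallest), and the core is filled with arbitrary values from its range rather than with recursive instances. The function name and parameters are hypothetical.

```python
import random


def median_via_core(n: int, gamma: int, x: int, seed: int = 0):
    """Build A with one element per shell pair (i, 2n+1-i) plus gamma core elements.

    Exactly n/2 - x shell elements are placed below the core. Returns the pair
    (median of A, x-th smallest core element), which should coincide.
    """
    rng = random.Random(seed)
    low_positions = range(1, n - gamma + 1)            # lower halves of the shell pairs
    below = set(rng.sample(low_positions, n // 2 - x))
    # From each shell pair take i itself if chosen, otherwise its partner 2n+1-i.
    shell = [i if i in below else 2 * n + 1 - i for i in low_positions]
    # Simplification: any gamma distinct values in the core range [n-gamma+1, n+gamma].
    core = sorted(rng.sample(range(n - gamma + 1, n + gamma + 1), gamma))
    a = sorted(shell + core)
    return a[n // 2 - 1], core[x - 1]
```

Because the shell contributes exactly $n/2 - x$ elements smaller than every core element, the $(n/2)$-th smallest element of $A$ is always the $x$-th smallest core element, whatever the core contains.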

Formally, the distribution of the hard instances $A$ of size $n$ on $[2n]$ is the following. Generate
$k$ recursive instances on $[2n']$ and place shifted versions of them back-to-back inside the core. Choose
$j \in [k]$ uniformly at random. Choose $\frac{n}{2} - \frac{\gamma}{k}(j - \frac{1}{2})$ uniformly random shell elements in $[1, n - \gamma]$ to include
in $A$; for every $i \in [1, n - \gamma] \setminus A$, we have $2n + 1 - i \in A$. This will ensure that the median of $A$ is precisely
the median of the $j$-th recursive instance inside the core.

Initially we have $n = n_0$ and the recursion only continues when $\gamma = \sqrt{n}/\log^2 n_0 \ge k \log n_0$, so in the
base case we have at least $\log n_0$ elements. In this case, the $i$-th element is chosen randomly and uniformly
from the paired elements $2i - 1$ and $2i$ and so the least significant bit of the median is uniformly chosen in
$\{0,1\}$.

The size of the problem after $t$ levels of recursion remains at least $n_0^{1/2^t}/(m \log^4 n_0)^{2 - 2/2^t}$ and our
definition gives at least $t$ levels provided that this size $n_0^{1/2^t}/(m \log^4 n_0)^{2 - 2/2^t} \ge \log n_0$; i.e.,
$n_0 \ge m^{2^{t+1}-2} \log^{9 \cdot 2^t - 2} n_0$. After one message for each level of recursion, the answer is still
not determined.
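To get a feel for how slowly the recursion depth grows, the following sketch iterates the size recurrence in the logarithmic domain. It assumes the parameter choices $k = m \log^2 n_0$, continuation threshold $\sqrt{n} \ge k \log^3 n_0$, and $n' = \sqrt{n}/(m \log^4 n_0)$; the function name and its interface (taking $\log_2 n_0$ and $\log_2 m$ directly so that astronomically large $n_0$ can be handled) are illustrative assumptions.

```python
import math


def recursion_levels(log2_n0: float, log2_m: float) -> int:
    """Number of recursive levels of the pairing, computed with base-2 logarithms.

    Assumes k = m * log^2(n0), recursion continues while sqrt(n) >= k * log^3(n0),
    and each level maps n to sqrt(n) / (m * log^4(n0)).
    """
    log_log = math.log2(log2_n0)            # log2(log2 n0)
    threshold = log2_m + 5 * log_log        # log2 of k * log^3(n0)
    shrink = log2_m + 4 * log_log           # log2 of m * log^4(n0)
    log2_n, levels = log2_n0, 0
    while log2_n / 2 >= threshold:
        log2_n = log2_n / 2 - shrink        # n' = sqrt(n) / (m * log^4 n0)
        levels += 1
    return levels
```

Each level roughly halves $\log n$, so the depth grows doubly logarithmically in $n_0$; for instance `recursion_levels(1000, 10)` (that is, $n_0 = 2^{1000}$ and $m = 2^{10}$) yields only 3 levels.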

The general idea of the lower bound is that each round of communication, which consists of at most
$m$ bits and is much smaller than the branching factor $k$, will give almost no information about a typical
recursive subproblem in the core.

We use the round elimination lemma to make this precise, and with it derive the following theorem:

Theorem 2. If, for $A$ chosen according to $\mathcal{D}_{m,n}$ and partitioned randomly, Alice and Bob determine the
least significant bit of the median of $A$ with bounded error $\epsilon < 1/2$ using $t$ messages of at most $m \ge \log n$
bits each, then $m^{2^{t+1}-2} \ge n/\log^{9 \cdot 2^t - 2} n$, which implies that $t \ge \log\log_m n - c$ for some constant $c$.

The Partition Between the Players 

To ensure that neither player has enough information to skip a level of the recursion, we insist that the shell 
for each subproblem be nicely partitioned between the two players. For any given shell there is a set of 
n' > rn?/2 > O.hlog^no shell pairs. Since a player receives a random 1/2 of all pairs, by Hoeffding’s 
inequality, with probability ), which is TOq pairs go to each player. We can use this 

to say that with high probability at least 1/3 of all shell elements at a level go to each player at every level 
of the recursion: This follows easily because over all levels of the recursive pairing, there are only a total of 
o(yTio) different shells associated with subproblems and each one fails only with probability 

From now on, fix a partition satisfying the above requirement at all recursion nodes. We will prove a
lower bound for any partition satisfying this property. Since we are discarding $o(1)$ of possible partitions,
the error of the protocol may increase by $o(1)$, which is negligible.

The Induction 

Our proof of Theorem 2 will work by induction, using the following message elimination lemma:

Lemma 3. Assume that there is a protocol for the median on instances of size $n$, with error $\epsilon$ on $\mathcal{D}_{m,n_0}$
for $\sqrt{n} \ge k \log n_0 = m \log^3 n_0$, using $t$ messages of size at most $m$ starting with Alice. Then, there is a
protocol for a subproblem of size $\gamma/k$, with error $\epsilon + O(1/\log n_0)$ on $\mathcal{D}_{m,n_0}$, using $t - 1$ messages of size at
most $m$ starting with Bob.

We use Lemma 3 to prove Theorem 2 by inductively eliminating all messages. Let $n_0 = n$. At each
application we remove one message to get an error increase of $O(1/\log n_0)$. If the number of rounds is less
than the number of levels of recursion, i.e., $m^{2^{t+1}-2} < n/\log^{9 \cdot 2^t - 2} n$, then the MedianBit value of
the subproblem will still be a uniformly random bit on the remaining input, but the protocol will have no
communication and the error will have increased to at most $\epsilon + O(t/\log n) < 1/2$ since $t$ is $O(\log\log_m n)$,
which is a contradiction.

To prove Lemma 3 we want to apply Lemma 1 using the $k$ subproblems in the core, but the assumption
of Lemma 1 requires that (1) Alice does not know anything about which subproblem $j \in [k]$ is chosen by
Bob, and (2) that subproblem $j$ is chosen uniformly at random. The choice of subproblem $j$ is determined
by the shell elements at this level.

Denote Alice's shell elements by $x^S$, and Bob's shell elements by $y^S$. Let Alice's part of the core
subproblems be $x_1, \ldots, x_k$, and Bob's part be $y_1, \ldots, y_k$. Note that the choice of the relevant subproblem $j$
is some function of $(x^S, y^S)$, and the median of the whole array is the median of $x_j \cup y_j$.

The proof of Lemma 3 proceeds in two stages:

Fixing $x^S$. We first fix the value of $x^S$ so that the choice of subproblem does not depend on Alice's input
and, moreover, so that the probabilities for different values of $j$ over Bob's input $y^S$ will not be very different
from each other because they are still near the middle binomial coefficients.

By the niceness of the partition of the pairs, we know that the number of Alice's shell pairs is
$|x^S| \in [\frac{1}{3}(n - \gamma), \frac{2}{3}(n - \gamma)]$. Let $a$ be the number of elements in $x^S$ that are below $n$. We want to fix $x^S$ such that
the error does not increase too much, and $|a - |x^S|/2| \le \sqrt{n} \cdot \log n_0$:

No matter which value of j G [k] is chosen in the input distribution, the shell elements chosen to be 
below n consist of a random subset of x® U of a fixed size thaf is befween n/2 — 7 and n/2 + 7 ; i.e., of 
fracfional sizepj befween ^ ^ and \ + '^- By Hoeffding’s inequalify, the probability that the actual number 

a of these elements that land in x® deviates from |x®|/2 by more than (f + ^)|x^| is at most Since 

(n — 7)/3 < |x®| < 2(n — 7 )/ 3 , the probability that this deviates from |x^|/2 by more than y/n log tiq is 
at most tiq We discard all values of x^ that lead to a outside this range. Now fix x^ fo be the value 

that minimizes the conditional error. 

Making $j$ uniform. Once $x^S$ is fixed, $j$ is a function only of $y^S$. Thus, we are close to the setup of
Lemma 1: Alice receives $x_1, \ldots, x_k$, Bob receives $y_1, \ldots, y_k$ and $j \in [k]$, and they want to compute a
function $f(x_j, y_j)$. The only problem is that the lemma requires a uniform distribution of $j$, whereas our
distribution is no longer uniform (having fixed $x^S$). However, we will argue that it is not far from uniform.
For each fixed $j_0 \in [k]$, if $a$ shell elements from Alice's part are below $n$, then Bob must have
$\frac{n}{2} - a - \frac{\gamma}{k}(j_0 - \frac{1}{2})$ shell elements below $n$. Therefore, $\Pr[j = j_0]$ is proportional to
$\binom{|y^S|}{\frac{n}{2} - a - \frac{\gamma}{k}(j_0 - \frac{1}{2})}$. More
precisely $\Pr[j = j_0]$ is this binomial coefficient divided by the sum of the coefficients for all $j_0$. Thus, to
understand how close $j$ is to uniform, we must understand the dependence of these binomial coefficients
on $j_0$.

Let $\Delta = a - |x^S|/2$. This satisfies $|\Delta| \le \sqrt{n} \log n_0$. Since $|y^S| = n - \gamma - |x^S| \ge (n - \gamma)/3 \ge n/4$ we have
$\binom{|y^S|}{\frac{n}{2} - a - \frac{\gamma}{k}(j_0 - \frac{1}{2})} = \binom{|y^S|}{|y^S|/2 - \Delta - \delta_{j_0}}$ where $0 \le \delta_{j_0} \le \gamma$. Assume wlog that $\Delta \ge 0$. The ratio between
different binomial coefficients is at most the ratio

$$\binom{n/4}{n/8 - \Delta} \Big/ \binom{n/4}{n/8 - \Delta - \gamma} = \frac{(n/8 + \Delta + \gamma) \cdots (n/8 + \Delta + 1)}{(n/8 - \Delta) \cdots (n/8 - \Delta - \gamma + 1)} \le \left(1 + \frac{10(2\Delta + \gamma)}{n}\right)^{\gamma}$$

which is $1 + O(\gamma(2\Delta + \gamma)/n) = 1 + O(1/\log n_0)$ given the values of $\Delta$ and $\gamma$.

Therefore we have shown that the statistical distance between the induced distribution on $j$ and the
uniform distribution is $O(1/\log n_0)$. We can thus consider the following alternative distribution for the problem:
pick $j$ uniformly at random, and manufacture $y^S$ conditioned on this $j$. The error on the new distribution
increases by at most $O(1/\log n_0)$.

Now we can apply the round elimination lemma, Lemma 1. As $k \ge m \log^2 n_0$, the lemma will increase
the error by $O(1/\log n_0)$.

5 Oblivious Branching Programs and the Median 

The following result is essentially due to Okol'nishnikova [20], who used it with slightly different
parameters for read-$k$ branching programs, and was independently derived by Ajtai [2] in the context of general
branching programs.

Proposition 4. Let $s$ be a sequence of $kn$ elements from $[n]$. If $s$ is divided into $r = 4k^2$ segments
$s_1, \ldots, s_r$, each of length $n/(4k)$, then there is an assignment of $2k$ segments $s_j$ to a set $L_A$ and all
remaining segments $s_j$ to $L_B$ so that the number $n_A$ ($n_B$) of elements of $[n]$ whose only appearances are in
segments in $L_A$ (respectively, $L_B$) satisfy $n_A \ge n/(2\binom{4k^2}{2k})$ and $n_B \ge n/2$.

Proof. There is a subset $V$ of at least $n/2$ elements of $[n]$ that occur at most $2k$ times in $s$ and hence appear
in at most $2k$ segments of $s$. Choose the $2k$ sets $s_j$ to include in $L_A$ uniformly at random. For a given
$i \in V$, $i$ will contribute to $n_A$ if and only if all of the at most $2k$ segments that contain its occurrences
are chosen for $L_A$. This occurs with probability at least $1/\binom{4k^2}{2k}$; hence the expected number of elements in
$V$ that only occur in segments of $L_A$ is at least $|V|/\binom{4k^2}{2k}$. Therefore we can select a fixed assignment that
has at least this number. Since the total length of segments in $L_A$ is at most $2kn/(4k) \le n/2$, at
least $n/2$ elements of $[n]$ only occur in segments in $L_B$. □
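For very small parameters the segment assignment can simply be searched exhaustively, which gives a way to sanity-check the two counts. The sketch below is a brute-force illustration, not the probabilistic argument of the proof; the function name, 0-based element labels, and the requirement that the sequence length be divisible by $4k^2$ are assumptions of this example.

```python
from itertools import combinations
from math import comb


def best_assignment(s, k):
    """Exhaustively choose 2k of the 4k^2 segments for L_A, maximizing n_A.

    s: query sequence of length k*n (length divisible by 4k^2).
    Returns (n_A, n_B, chosen segment indices), where n_A (n_B) counts the elements
    all of whose occurrences lie in segments of L_A (of L_B).
    """
    r = 4 * k * k
    seg_len = len(s) // r
    where = {}                                   # element -> set of segments containing it
    for idx in range(r):
        for element in s[idx * seg_len:(idx + 1) * seg_len]:
            where.setdefault(element, set()).add(idx)
    best = None
    for chosen in combinations(range(r), 2 * k):
        l_a = set(chosen)
        n_a = sum(1 for segs in where.values() if segs <= l_a)
        n_b = sum(1 for segs in where.values() if not (segs & l_a))
        if best is None or n_a > best[0]:
            best = (n_a, n_b, chosen)
    return best
```

For a single left-to-right pass over 8 inputs (`k = 1`), the four segments hold two inputs each, so any two segments give `n_A = n_B = 4`. The search is exponential in `k` and is only meant for toy sizes.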

Lemma 5. Suppose that there is a $2n$-way oblivious branching program of size $2^S$ running in time $T = kn$
that computes MedianBit for $n$ distinct inputs from $[2n]$. Then there is a deterministic 2-party
communication protocol using at most $4k$ messages of $S$ bits each plus a final 1-bit message to compute MedianBit
for $N = \lceil n/\binom{4k^2}{2k} \rceil$ distinct inputs from $[2N]$ that are divided evenly between the two players.

Proof. Let $s$ be the length $T$ sequence of indices of inputs queried by the oblivious branching program.
Let $k = T/n$, $r = 4k^2$, and $N = \lceil n/\binom{4k^2}{2k} \rceil$. Fix the assignment of segments to $L_A$ and $L_B$ given by
Proposition 4. Arbitrarily select a subset $I_A$ of $N/2$ of the $n_A$ indices that only appear in $L_A$ and give those
inputs to player A. Similarly, select a subset $I_B$ of $N/2$ of the $n_B$ indices that only appear in $L_B$ and give
those inputs to player B. Let $Q$ be the remaining set of $n - N$ input indices.

Fix any input assignment to the indices in $Q$ that assigns $(n - N)/2$ distinct values from $[n - N]$ to
half the elements of $Q$ and the same number of distinct values from $[n + N + 1, 2n]$ to the other half of
the elements of $Q$. After fixing this partial assignment we restrict the remaining inputs to have values in the
segment $[n - N + 1, n + N]$ of length $2N$.

The communication protocol is derived as follows: Alice (resp. Bob) interprets her $N/2$ inputs from
$[2N]$ as assignments from $[2n]$ to the elements of $I_A$ (resp. $I_B$) by adding $n - N$ to each value. Alice
will simulate the branching program executing the segments in $L_A$ and Bob will simulate the branching
program executing the segments in $L_B$. A player will continue the simulation until the next segment is held
by the other player, at which point that player communicates the name of the node in the branching program
reached at the end of its layer. Since $L_A$ has only $2k$ segments, there are at most $4k$ alternations between
players as well as the final output bit which gives the total communication. By construction, the median
of the whole problem is the median of the $N$ elements and the final answer for MedianBit on $[2N]$ is
computed by XOR-ing the result with the low order bit of $n - N$. □

Theorem 6. Any oblivious branching program computing MedianBit for $n$ inputs from $[2n]$ in time $T \le$

kn requires size at least >; in particular, if it uses space S, any oblivious branching program 

requires time $T \ge 0.25\, n \log\log_S n - cn$ for some constant $c$.

Proof. Since $T/n \le k$, applying Lemma 5 we derive a 2-party communication protocol sending $t = 4k + 1$
messages of at most $S \ge \log n$ bits each to compute MedianBit on $N \ge n/\binom{4k^2}{2k} \ge n/(2ek)^{2k}$ inputs
from [2A^]. By Theorem]^ S > iQg(9-2‘-2)/(2*+i-2) ^ ^ jYi/( 2 '^''+ 2 - 2 )y j^gTi/is 

t > 4 and hence S > j jQgS ^i. The size of fhe branching program is 2'^ where S is ifs space. 

Moreover, taking logarithms base $S$ and then base 2 we have $4k \ge \log\log_S n - c'$ for some constant $c'$. □


Analog of $\mathrm{P} \ne \mathrm{NP} \cap \mathrm{coNP}$ for time-bounded oblivious BPs

Corollary 7. Any oblivious branching program of length T < kn computing the low order bit of the median 
requires size at least in particular, this size is super-polynomial when T is o{n log log n). 

On the other hand, the median can be computed by a nondeterministic oblivious read-once branching
program using only $O(\log n)$ space.

Lemma 8. There is a nondeterministic oblivious read-once branching program of size $O(n^4)$ that computes
the median on $n$ integers from $[2n]$.

Proof. The branching program guesses the value of the median in $[2n]$ and keeps track of the number of
elements that it has seen both less than the median and equal to the median in order to check that the value is
correct. Other than the source and sink nodes there is one node of the branching program for each $(i, m, \ell, e)$
for $m \in [2n]$, $i \in [n]$ such that $0 \le \ell + e \le \min(i, (n + 1)/2 + 1)$. The source node which queries $x_1$ is
the only node to have multiple outedges with the same value label. It has $(2n)^2$ outedges, $2n$ for each value,
one corresponding to each of the median value guesses. If at a node $(i, m, \ell, e)$, the values $i, \ell, e$ together
with the value $j$ of $x_{i+1}$ are inconsistent with $m$ being the median then the outedge for $j$ is not present. □
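The guess-and-verify idea can be sketched as a one-pass check that keeps only two counters, mirroring the $\ell$ and $e$ components of the node labels. This is an illustrative simulation rather than the branching program itself: it tries every guess in turn instead of branching nondeterministically, and it checks consistency only at the end of the pass instead of removing inconsistent outedges along the way. The function names and the choice of rank $\lceil n/2 \rceil$ are assumptions.

```python
def accepts_guess(xs, guess, rank):
    """Single left-to-right pass keeping only (#values < guess, #values == guess)."""
    less = equal = 0
    for value in xs:
        if value < guess:
            less += 1
        elif value == guess:
            equal += 1
    # guess is the rank-th smallest value iff less < rank <= less + equal
    return less < rank <= less + equal


def median_by_guessing(xs, universe_size):
    """Try every guess in [1, universe_size]; exactly one passes verification."""
    rank = (len(xs) + 1) // 2
    accepted = [g for g in range(1, universe_size + 1) if accepts_guess(xs, g, rank)]
    assert len(accepted) == 1
    return accepted[0]
```

Each verification pass reads the inputs once in a fixed order and stores two counters bounded by $n$ plus the guess, which is where the $O(\log n)$ space of the nondeterministic program comes from.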

In particular, in contrast to Corollary 7, Lemma 8 implies that MedianBit can be computed in
polynomial size by length $n$ nondeterministic and co-nondeterministic oblivious branching programs, hence we
have shown the analog of $\mathrm{P} \ne \mathrm{NP} \cap \mathrm{coNP}$ for oblivious branching programs of length $o(n \log\log n)$.

6 Beyond Oblivious Branching Programs 

We first observe that our lower bounds for the median problem extend to the case of read-once branching 
programs by using the fact that such programs for the median can also be assumed to be oblivious without 
loss of generality. (Oblivious read-once branching programs are also known as ordered binary decision 
diagrams (OBDDs).) 

Lemma 9. If $f : D^n \to R$ is a symmetric function of its inputs then for every read-once branching program $B$
computing $f$ there is an oblivious read-once branching program, of precisely the same size as $B$, that
computes $f$.

Proof. With each node $v$ in a read-once branching program, we can associate a set $I_v \subseteq [n]$ of input indices
that are read along paths from the source node to $v$. We make $B$ into an oblivious branching program
by replacing the index at node $v$ by $|I_v| + 1$. This yields an oblivious read-once branching program (not
necessarily leveled) that reads its inputs in the order $x_1, x_2, \ldots, x_n$ along every path (possibly skipping
over some inputs on the path). Since $f$ is a symmetric function, a path of length $t \le n$ in $B$ queries $t$
different input locations and the value of the function on the partial inputs is the same because the function
is symmetric and the values in those $t$ input locations are the same. □

We immediately obtain the following corollary. 

Corollary 10. For any $\epsilon < 1/2$, any read-once branching program computing MedianBit for $n$ integers
from [2n] requires size 2"^ ^ . 

In particular this means that MedianBit is another example, after those in [16], of a problem showing
the analogue of $\mathrm{P} \ne \mathrm{NP} \cap \mathrm{coNP}$ for read-once branching programs. However, proving the analogous
property even for read-twice branching programs remains open and will require a fundamentally new technique
for deriving branching program lower bounds.

The approach in all lower bounds for general branching programs (or even for read-$k$ branching programs) 
computing decision problems ifTTl l20l |7J |2j [U |6l [H applies equally well to nondeterministic 
computation. (For example, the fact that the technique also works for nondeterministic computation is made 
explicit in ifTH.) Though this technique has been used to separate nondeterministic from deterministic 
computation 151 computing a Boolean function $f$, it is achieved by proving a nondeterministic lower bound for 
computing $f$. Since the nondeterministic oblivious read-once branching program computing the median has 
$T = n$ and $S = O(\log n)$, the core of the median’s hardness, MedianBit, and its complement do not 
have non-trivial nondeterministic lower bounds; hence current time-space tradeoff lower bound techniques are 
powerless for computing the median. 

We conjecture that the lower bound $T = \Omega(n \log\log_S n)$ holds for finding the median using general 
non-oblivious algorithms as well as oblivious and comparison algorithms. 
A Proof of Lemma [I] 


The proof is inspired by that of the parallel repetition theorem. For $(x, y)$ chosen according to $\mathcal{D}$, we first 
design a public coin protocol in which the players randomly choose $i \in [k]$, and with small probability of 
error jointly choose a random vector $W^{-i}$ of values consisting of exactly one of $x_j$ or $y_j$ for each $j \ne i$ and 
a random message $M$ for Alice consistent with those inputs whose distribution is close to that of Alice’s 
first message. The players then independently use the public coins to randomly complete their inputs to be 
consistent with $\mathcal{D}$ on each coordinate (and in Alice’s case consistent with the message agreed upon). The 
resulting protocol will have expected error at most $\epsilon + \delta$. We then fix the public coins (and hence all inputs 
other than $(x, y)$) to create the claimed deterministic protocol. 

Let $X_i$ and $Y_i$ for $i \in [k]$ denote the random variables associated with the components of the distribution 
$\mathcal{D}^k$. Let $M$ denote the random variable for Alice’s first message. Define the random variable $W_i$ that is $X_i$ 
with probability 1/2 and $Y_i$ with probability 1/2. Let $W$ denote the random variable $W_1 \ldots W_k$ and $W^{-i}$ 
denote the variable $W$ with $W_i$ removed. Then 


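\begin{align*}
m_1 &\ge H(M) \\
&\ge I(M; X_1 Y_1 \ldots X_k Y_k \mid W) && \text{by definition} \\
&\ge \sum_{i=1}^{k} I(M; X_i Y_i \mid W) && \text{since the $X_i Y_i$ are conditionally independent given $W$} \\
&= \sum_{i=1}^{k} I(M; X_i Y_i \mid W_i W^{-i}) \\
&= \frac{1}{2} \sum_{i=1}^{k} \left[ I(M; X_i Y_i \mid X_i W^{-i}) + I(M; X_i Y_i \mid Y_i W^{-i}) \right] && \text{by definition of $W_i$} \\
&= \frac{1}{2} \sum_{i=1}^{k} \left[ I(M; Y_i \mid X_i W^{-i}) + I(M; X_i \mid Y_i W^{-i}) \right] \\
&= \frac{1}{2} \sum_{i=1}^{k} \left[ I(M W^{-i}; Y_i \mid X_i) - I(W^{-i}; Y_i \mid X_i) + I(M W^{-i}; X_i \mid Y_i) - I(W^{-i}; X_i \mid Y_i) \right] && \text{by the chain rule} \\
&= \frac{1}{2} \sum_{i=1}^{k} \left[ I(M W^{-i}; Y_i \mid X_i) + I(M W^{-i}; X_i \mid Y_i) \right] && \text{since $W^{-i}$ is independent of $X_i Y_i$.}
\end{align*}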


Since $m_1 \le \delta^2 k/(8 \ln 2)$, it follows that $\mathbb{E}_{i \in [k]}(I(MW^{-i}; Y_i \mid X_i) + I(MW^{-i}; X_i \mid Y_i)) \le \delta^2/(4 \ln 2)$. 
We use this to derive that in expectation over random choices of $i$, and $(x, y)$ chosen from $\mathcal{D}$, the distributions 
$MW^{-i} \mid X_i = x$ and $MW^{-i} \mid Y_i = y$ are both statistically close to the distribution $MW^{-i} \mid X_i = x, Y_i = y$. We 
now use the following proposition which follows from Pinsker’s inequality. 

Proposition 11. Let $P$ and $Q$ be probability distributions. 
Then $\mathbb{E}_{q \sim Q} \|P - (P \mid Q = q)\|^2 \le \frac{\ln 2}{2} I(P; Q)$. 
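
As an aside, the inequality of Proposition 11 can be checked numerically on small joint distributions. The sketch below assumes that $\|\cdot\|$ denotes statistical (total variation) distance, i.e. half the $\ell_1$ distance, and that mutual information is measured in bits; the helper names `pinsker_check` and `random_joint` are introduced here for illustration only.

```python
import math
import random


def pinsker_check(joint):
    """joint[p][q] = Pr[P = p, Q = q]. Returns both sides of Proposition 11."""
    n_p, n_q = len(joint), len(joint[0])
    marg_p = [sum(joint[p]) for p in range(n_p)]
    marg_q = [sum(joint[p][q] for p in range(n_p)) for q in range(n_q)]

    # Mutual information I(P;Q) in bits.
    info = sum(
        joint[p][q] * math.log2(joint[p][q] / (marg_p[p] * marg_q[q]))
        for p in range(n_p)
        for q in range(n_q)
        if joint[p][q] > 0
    )

    # Expected squared total variation distance between P and (P | Q = q).
    lhs = 0.0
    for q in range(n_q):
        if marg_q[q] == 0:
            continue
        tv = 0.5 * sum(abs(marg_p[p] - joint[p][q] / marg_q[q]) for p in range(n_p))
        lhs += marg_q[q] * tv ** 2

    return lhs, (math.log(2) / 2) * info


def random_joint(n_p, n_q, rng):
    """A random joint distribution on an n_p-by-n_q grid (illustration only)."""
    weights = [[rng.random() for _ in range(n_q)] for _ in range(n_p)]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]


rng = random.Random(0)
RESULTS = [pinsker_check(random_joint(4, 5, rng)) for _ in range(100)]
```

Each pair in `RESULTS` holds the left-hand and right-hand sides of the proposition for one random joint distribution; for independent $P$ and $Q$ both sides are zero.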
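It follows that 

\begin{align*}
\mathbb{E}_{(x,y) \sim \mathcal{D}} \|(MW^{-i} \mid X_i = x) - (MW^{-i} \mid X_i = x, Y_i = y)\|^2
&= \mathbb{E}_x \mathbb{E}_{y \sim \mathcal{D}_{Y \mid X = x}} \|(MW^{-i} \mid X_i = x) - (MW^{-i} \mid X_i = x, Y_i = y)\|^2 \\
&\le \frac{\ln 2}{2} \mathbb{E}_x I(MW^{-i}; Y_i \mid X_i = x) && \text{by Proposition 11} \\
&= \frac{\ln 2}{2} I(MW^{-i}; Y_i \mid X_i) && \text{by definition.}
\end{align*}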

The analogous bound shows that the expected square of the statistical distance between the distributions 
$MW^{-i} \mid Y_i = y$ and $MW^{-i} \mid X_i = x, Y_i = y$ is at most $\frac{\ln 2}{2} I(MW^{-i}; X_i \mid Y_i)$. 

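Therefore in expectation over choices of $(x, y)$ and $i$, the sum of the squares of the statistical distances 
between the distributions $MW^{-i} \mid X_i = x$ and $MW^{-i} \mid X_i = x, Y_i = y$ and between $MW^{-i} \mid Y_i = y$ and 
$MW^{-i} \mid X_i = x, Y_i = y$ is at most $\delta^2/8$. Write $\epsilon_{x,y,i,X} = \|(MW^{-i} \mid X_i = x) - (MW^{-i} \mid X_i = x, Y_i = y)\|$ and 
$\epsilon_{x,y,i,Y} = \|(MW^{-i} \mid Y_i = y) - (MW^{-i} \mid X_i = x, Y_i = y)\|$. In particular we have 

\[ \mathbb{E}_{(x,y) \sim \mathcal{D}} \mathbb{E}_{i \in [k]} \left( \epsilon_{x,y,i,X}^2 + \epsilon_{x,y,i,Y}^2 \right) \le \delta^2/8 \]

and hence 

\[ \mathbb{E}_{(x,y) \sim \mathcal{D}} \mathbb{E}_{i \in [k]} \left( \epsilon_{x,y,i,X} + \epsilon_{x,y,i,Y} \right) \le \delta/2. \]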
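
The constants in these two bounds can be checked directly. The worked step below is a sketch, assuming the bound $\delta^2/(4 \ln 2)$ on the expected information and the constant $\frac{\ln 2}{2}$ in Proposition 11; passing from squares to the plain sum via Jensen's inequality and $(a+b)^2 \le 2(a^2+b^2)$ is one standard route and is not spelled out in the text. Subscripts $x, y, i$ are suppressed, and $\mathbb{E}$ is over $(x,y) \sim \mathcal{D}$ and $i \in [k]$.

```latex
\begin{align*}
\mathbb{E}\left(\epsilon_X^2 + \epsilon_Y^2\right)
  &\le \frac{\ln 2}{2}\,
       \mathbb{E}_{i \in [k]}\left( I(MW^{-i}; Y_i \mid X_i) + I(MW^{-i}; X_i \mid Y_i) \right)
   \le \frac{\ln 2}{2} \cdot \frac{\delta^2}{4 \ln 2}
   = \frac{\delta^2}{8}, \\
\mathbb{E}\left(\epsilon_X + \epsilon_Y\right)
  &\le \sqrt{\mathbb{E}\left(\epsilon_X + \epsilon_Y\right)^2}
   \le \sqrt{2\,\mathbb{E}\left(\epsilon_X^2 + \epsilon_Y^2\right)}
   \le \sqrt{2 \cdot \frac{\delta^2}{8}}
   = \frac{\delta}{2}.
\end{align*}
```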

We now apply Holenstein’s Lemma ifTSl to say that given $x$, $y$ and $i$, except for a failure probability of 
at most $2(\epsilon_{x,y,i,X} + \epsilon_{x,y,i,Y})$, Alice and Bob without any communication can use the shared random string 
to agree on a sample $(m, w^{-i})$ from $MW^{-i} \mid X_i = x, Y_i = y$. Therefore the expected failure probability is at 
most $\delta$. 
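
One common way to realize this kind of agreement without communication is rejection sampling against a shared stream of proposals. The sketch below illustrates that idea only and is not taken from the cited lemma: the function names, the three-element universe, and the two distributions `ALICE` and `BOB` are hypothetical. Each player accepts the first shared proposal $(u, h)$ with $h$ below its own probability of $u$, so each output has exactly that player's distribution, and the two outputs differ with probability at most twice the statistical distance between the two distributions.

```python
import random


def total_variation(p, q, universe):
    """Statistical distance: half the L1 distance between two pmfs."""
    return 0.5 * sum(abs(p.get(u, 0.0) - q.get(u, 0.0)) for u in universe)


def correlated_sample(pmf, universe, shared_seed, max_rounds=100_000):
    """Return the first shared proposal u whose height h satisfies h < pmf[u].

    Both players seed the generator identically, so they see the same
    proposals; each player consults only its own pmf when accepting.
    """
    rng = random.Random(shared_seed)
    for _ in range(max_rounds):
        u = rng.choice(universe)  # shared uniform element
        h = rng.random()          # shared uniform height in [0, 1)
        if h < pmf.get(u, 0.0):
            return u
    raise RuntimeError("no proposal accepted")


UNIVERSE = ["a", "b", "c"]
ALICE = {"a": 0.5, "b": 0.3, "c": 0.2}  # hypothetical stand-in for Alice's view
BOB = {"a": 0.4, "b": 0.3, "c": 0.3}    # hypothetical stand-in for Bob's view

TRIALS = 2000
FAILURES = sum(
    correlated_sample(ALICE, UNIVERSE, seed) != correlated_sample(BOB, UNIVERSE, seed)
    for seed in range(TRIALS)
)
FAILURE_RATE = FAILURES / TRIALS
```

Here `FAILURE_RATE` is the empirical fraction of shared seeds on which the two players disagree, to be compared against `2 * total_variation(ALICE, BOB, UNIVERSE)`.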

Once Alice and Bob have selected $(m, w^{-i})$, Alice uses private randomness to sample the remainder of 
her input from $X_1 \ldots X_k \mid M = m, W^{-i} = w^{-i}, X_i = x$. Bob independently uses private randomness to sample 
the remainder of his input from 

\[ Y_1 \ldots Y_k \mid M = m, W^{-i} = w^{-i}, Y_i = y \;=\; Y_1 \ldots Y_k \mid W^{-i} = w^{-i}, Y_i = y \]

which only depends on $\mathcal{D}^k$. 

Then Alice and Bob simulate the remainder of the protocol starting with the second message overall; i.e., 
Bob’s first message. The difference in the distribution from $\mathcal{D}^k$ on the result is at most $\delta$, so the expected 
error is at most $\epsilon + \delta$. 